{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# asyncio\n",
    "\n",
    "<https://zhuanlan.zhihu.com/p/25228075>\n",
    "\n",
    "asyncio是python 3.4中新增的模块，它提供了一种机制，使得你可以用协程（coroutines）、IO复用（multiplexing I/O）在单线程环境中编写并发模型。\n",
    "\n",
    "根据官方说明，asyncio模块主要包括了：\n",
    "\n",
    "- 具有特定系统实现的事件循环（event loop）;\n",
    "\n",
    "- 数据通讯和协议抽象（类似Twisted中的部分);\n",
    "\n",
    "- TCP，UDP,SSL，子进程管道，延迟调用和其他;\n",
    "\n",
    "- Future类;\n",
    "\n",
    "- yield from的支持;\n",
    "\n",
    "- 同步的支持;\n",
    "\n",
    "- 提供向线程池转移作业的接口;\n",
    "\n",
    "下面来看下asyncio的一个例子：\n",
    "\n",
    "注意：不要再jupyter里运行，最好用pycharm☺\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 示例一"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```py\n",
    "import asyncio\n",
    "\n",
    "async def compute(x, y):\n",
    "    print(\"Compute %s + %s ...\" % (x, y))\n",
    "    await asyncio.sleep(1.0)\n",
    "    return x + y\n",
    "\n",
    "async def print_sum(x, y):\n",
    "    result = await compute(x, y)\n",
    "    print(\"%s + %s = %s\" % (x, y, result))\n",
    "\n",
    "loop = asyncio.get_event_loop()\n",
    "loop.run_until_complete(print_sum(1, 2))\n",
    "loop.close()\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](../assets/yield.png)\n",
    "\n",
    "当事件循环开始运行时，它会在Task中寻找coroutine来执行调度，因为事件循环注册了print_sum()，因此print_sum()被调用，执行result = await compute(x, y)这条语句（等同于result = yield from compute(x, y)），因为compute()自身就是一个coroutine，因此print_sum()这个协程就会暂时被挂起，compute()被加入到事件循环中，程序流执行compute()中的print语句，打印”Compute %s + %s …”，然后执行了await asyncio.sleep(1.0)，因为asyncio.sleep()也是一个coroutine，接着compute()就会被挂起，等待计时器读秒，在这1秒的过程中，事件循环会在队列中查询可以被调度的coroutine，而因为此前print_sum()与compute()都被挂起了，因此事件循环会停下来等待协程的调度，当计时器读秒结束后，程序流便会返回到compute()中执行return语句，结果会返回到print_sum()中的result中，最后打印result，事件队列中没有可以调度的任务了，此时loop.close()把事件队列关闭，程序结束。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 示例二"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```py\n",
    "# -*- coding: utf-8 -*-\n",
    "import asyncio\n",
    "from datetime import datetime\n",
    "# import nest_asyncio\n",
    "# nest_asyncio.apply()\n",
    "\n",
    "\n",
    "async def add(a, b):\n",
    "    await asyncio.sleep(1)\n",
    "    return a + b\n",
    "\n",
    "\n",
    "async def master_thread(loop):\n",
    "    print(\"{} master: 1+2={}\".format(datetime.now(), await add(1, 2)))\n",
    "\n",
    "\n",
    "def slave_thread(loop):\n",
    "    # 注意：这不是 coroutine 函数\n",
    "    import time\n",
    "    time.sleep(2)\n",
    "\n",
    "    f = asyncio.run_coroutine_threadsafe(add(1, 2), loop)\n",
    "    print(\"{} slave: 1+2={}\".format(datetime.now(), f.result()))\n",
    "\n",
    "\n",
    "async def main(loop):\n",
    "    await asyncio.gather(\n",
    "        master_thread(loop),\n",
    "        # 线程池内执行\n",
    "        loop.run_in_executor(None, slave_thread, loop),\n",
    "    )\n",
    "\n",
    "\n",
    "if __name__ == '__main__':\n",
    "    loop = asyncio.get_event_loop()\n",
    "    loop.run_until_complete(main(loop))\n",
    "    loop.close()\n",
    "    \n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 示例三\n",
    "\n",
    "下面代码中模拟了使用 2 个线程管理 event loop 的情况,\n",
    "\n",
    "其中子线程（thread 函数）设定并启用了一个 999 次的 task 任务。\n",
    "\n",
    "而在主线程中，添加了一个 5 次的 task 任务。我设置了一个 While 循环来检查这项任务是否已经运行完毕，如若完毕则打印出 main done: 5。\n",
    "\n",
    "子线程独立于主线程，直到运行完才退出。\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```py\n",
    "import asyncio\n",
    "import threading\n",
    "\n",
    "# 模拟协程\n",
    "async def task(c, i):\n",
    "    for _ in range(i):\n",
    "        print(c)\n",
    "        await asyncio.sleep(1)\n",
    "    return i\n",
    "\n",
    "def thread(loop):  # 异步程序\n",
    "    asyncio.set_event_loop(loop)\n",
    "    asyncio.ensure_future(task('sub thread', 999))\n",
    "    loop.run_forever()\n",
    "\n",
    "def main():\n",
    "    threading.Thread(target=thread, args=(asyncio.get_event_loop(), )).start()\n",
    "\n",
    "    # 同步代码开始\n",
    "    future = asyncio.ensure_future(task('main thread', 5))\n",
    "    while not future.done():\n",
    "        time.sleep(1)\n",
    "    print('main done: %s' % future.result())\n",
    "\n",
    "\n",
    "if __name__ == '__main__':\n",
    "    main()\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 示例三：豆瓣爬虫"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```py\n",
    "# -*- coding:utf-8 -*-\n",
    "from lxml import etree\n",
    "from time import time\n",
    "import asyncio\n",
    "import aiohttp\n",
    "\n",
    "url = 'https://movie.douban.com/top250'\n",
    "\n",
    "\n",
    "async def fetch_content(url):\n",
    "    async with aiohttp.ClientSession() as session:\n",
    "        async with session.get(url) as response:\n",
    "            return await response.text()\n",
    "\n",
    "\n",
    "async def parse(url):\n",
    "    page = await fetch_content(url)\n",
    "    html = etree.HTML(page)\n",
    "\n",
    "    xpath_movie = '//*[@id=\"content\"]/div/div[1]/ol/li'\n",
    "    xpath_title = './/span[@class=\"title\"]'\n",
    "    xpath_pages = '//*[@id=\"content\"]/div/div[1]/div[2]/a'\n",
    "\n",
    "    pages = html.xpath(xpath_pages)\n",
    "    fetch_list = []\n",
    "    result = []\n",
    "    # 第一页电影条目\n",
    "    for element_movie in html.xpath(xpath_movie):\n",
    "        result.append(element_movie)\n",
    "    # 获取分页地址url\n",
    "    for p in pages:\n",
    "        fetch_list.append(url + p.get('href'))\n",
    "    # 协程队列\n",
    "    tasks = [fetch_content(url) for url in fetch_list]\n",
    "    # 等待全部协程完成，获得每个协程的返回值（分页的html源码），列表顺序与传入顺序相同\n",
    "    pages = await asyncio.gather(*tasks)\n",
    "\n",
    "    for page in pages:\n",
    "        html = etree.HTML(page)\n",
    "        for element_movie in html.xpath(xpath_movie):\n",
    "            result.append(element_movie)\n",
    "    for i, movie in enumerate(result, 1):\n",
    "        # 输出： 序号 电影名\n",
    "        title = movie.find(xpath_title).text\n",
    "        # print(i, title)\n",
    "\n",
    "\n",
    "def main():\n",
    "    loop = asyncio.get_event_loop()\n",
    "    start = time()\n",
    "    # 重复请求5次（串行），模拟多任务\n",
    "    for i in range(5):\n",
    "        loop.run_until_complete(parse(url))\n",
    "    end = time()\n",
    "    print('Cost {} seconds'.format((end - start) / 5))\n",
    "    loop.close()\n",
    "\n",
    "\n",
    "if __name__ == '__main__':\n",
    "    main()\n",
    "\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "178.818px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
